WEKA - Experiences with a Java Open-Source Project

Authors

  • Remco R. Bouckaert
  • Eibe Frank
  • Mark A. Hall
  • Geoff Holmes
  • Bernhard Pfahringer
  • Peter Reutemann
  • Ian H. Witten
Abstract

WEKA is a popular machine learning workbench with a development life of nearly two decades. This article provides an overview of the factors that we believe to be important to its success. Rather than focussing on the software’s functionality, we review aspects of project management and historical development decisions that likely had an impact on the uptake of the project.

Similar resources

Big Data with ADAMS

ADAMS is a modular open-source Java framework for developing workflows available for academic research as well as commercial applications. It integrates data mining applications, like MOA, WEKA, MEKA and R, image and video processing and feature generation capabilities, spreadsheet and database access, visualizations, GIS, web services and fast prototyping of new functionality using scripting lan...

MEKA: A Multi-label/Multi-target Extension to WEKA

Multi-label classification has rapidly attracted interest in the machine learning literature, and there are now a large number and considerable variety of methods for this type of learning. We present Meka: an open-source Java framework based on the well-known Weka library. Meka provides interfaces to facilitate practical application, and a wealth of multi-label classifiers, evaluation metrics,...

An Implementation of FP-Growth Algorithm Based on High Level Data Structures of Weka-JUNG Framework

FP-Growth is a classical data mining algorithm; most of its current implementations build their data structures on the programming language's primitive data types, which leads to poor readability and reusability of the code. Weka is an open-source platform for data mining, but lacks the ability to deal with tree-structured data; JUNG is a network/graph computation framework. Starting fr...

AMIDST: a Java Toolbox for Scalable Probabilistic Machine Learning

The AMIDST Toolbox is software for scalable probabilistic machine learning with a special focus on (massive) streaming data. The toolbox supports a flexible modeling language based on probabilistic graphical models with latent variables and temporal dependencies. The specified models can be learnt from large data sets using parallel or distributed implementations of Bayesian learning algorith...

Watsonsim: Overview of a Question Answering Engine

The objective of the project is to design and run a system to answer Jeopardy questions, similar to Watson. In the course of a semester, we developed an open source question answering system using the Indri, Lucene, Bing and Google search engines, Apache UIMA, OpenNLP, and Weka among many additional modules. By the end of the semester, we achieved 18% accuracy on Jeopardy questions, and work ha...

Journal title:
  • Journal of Machine Learning Research

Volume 11

Publication date 2010